Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving

نویسندگان

  • Shai Shalev-Shwartz
  • Shaked Shammah
  • Amnon Shashua
چکیده

Autonomous driving is a multi-agent setting where the host vehicle must apply sophisticated negotiation skills with other road users when overtaking, giving way, merging, taking left and right turns and while pushing ahead in unstructured urban roadways. Since there are many possible scenarios, manually tackling all possible cases will likely yield a too simplistic policy. Moreover, one must balance between unexpected behavior of other drivers/pedestrians and at the same time not to be too defensive so that normal traffic flow is maintained. In this paper we apply deep reinforcement learning to the problem of forming long term driving strategies. We note that there are two major challenges that make autonomous driving different from other robotic tasks. First, is the necessity for ensuring functional safety — something that machine learning has difficulty with given that performance is optimized at the level of an expectation over many instances. Second, the Markov Decision Process model often used in robotics is problematic in our case because of unpredictable behavior of other agents in this multi-agent scenario. We make three contributions in our work. First, we show how policy gradient iterations can be used, and the variance of the gradient estimation using stochastic gradient ascent can be minimized, without Markovian assumptions. Second, we decompose the problem into a composition of a Policy for Desires (which is to be learned) and trajectory planning with hard constraints (which is not learned). The goal of Desires is to enable comfort of driving, while hard constraints guarantees the safety of driving. Third, we introduce a hierarchical temporal abstraction we call an “Option Graph” with a gating mechanism that significantly reduces the effective horizon and thereby reducing the variance of the gradient estimation even further. The Option Graph plays a similar role to “structured prediction” in supervised learning, thereby reducing sample complexity, while also playing a similar role to LSTM gating mechanisms used in supervised deep networks.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Tactical Decision Making for Lane Changing with Deep Reinforcement Learning

In this paper we consider the problem of autonomous lane changing for self driving cars in a multi-lane, multi-agent setting. We present a framework that demonstrates a more structured and data efficient alternative to end-to-end complete policy learning on problems where the high-level policy is hard to formulate using traditional optimization or rule based methods but well designed low-level ...

متن کامل

Voltage Coordination of FACTS Devices in Power Systems Using RL-Based Multi-Agent Systems

This paper describes how multi-agent system technology can be used as the underpinning platform for voltage control in power systems. In this study, some FACTS (flexible AC transmission systems) devices are properly designed to coordinate their decisions and actions in order to provide a coordinated secondary voltage control mechanism based on multi-agent theory. Each device here is modeled as ...

متن کامل

Hybrid autonomous control for heterogeneous multi-agent system

Abstmcl-Reinforcement learning is an adaptive and flexible control method for autonomous system. In our previous works, we had proposed a reinforcement learning algorithm for redundant systems: "Q-learning with dynamic structuring of exploration space based on GA (QDSEGAY, and applied it to multi-agent systems. However previous works of the QDSEGA have been restricted to homogeneous agents. In ...

متن کامل

Virtual to Real Reinforcement Learning for Autonomous Driving

Reinforcement learning is considered as a promising direction for driving policy learning. However, training autonomous driving vehicle with reinforcement learning in real environment involves non-affordable trial-and-error. It is more desirable to first train in a virtual environment and then transfer to the real environment. In this paper, we propose a novel realistic translation network to m...

متن کامل

Cooperative Multi-Agent Systems from the Reinforcement Learning Perspective - Challenges, Algorithms, and an Application

Reinforcement Learning has established as a framework that allows an autonomous agent for automatically acquiring – in a trial and error-based manner – a behavior policy based on a specification of the desired behavior of the system. In a multi-agent system, however, the decentralization of the control and observation of the system among independent agents has a significant impact on learning a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1610.03295  شماره 

صفحات  -

تاریخ انتشار 2016